by Peter de Blanc + ChatGPT o1 pro
Posted to Adarie (www.adarie.com) on March 19, 2025
Content License: Creative Commons BY (Attribution)
First, let us state the problem precisely:
You have a Boltzmann machine whose visible/hidden units $x_i$ take values in $\{0,1\}$.
Its energy function is of the form
$$E(\mathbf{x}) \;=\; -\sum_{i} b_i\,x_i \;-\; \sum_{i<j} w_{ij}\,x_i\,x_j.$$
You now want to switch to “Ising spins” $s_i$ that take values in $\{-1,+1\}$ instead, but keep exactly the same probability distribution (up to an overall constant in the energy that does not affect probabilities).
The usual relationship between the original bits $x_i \in \{0,1\}$ and Ising spins $s_i \in \{-1,+1\}$ is
$$s_i \;=\; 2\,x_i - 1 \quad\Longleftrightarrow\quad x_i \;=\; \frac{s_i + 1}{2}.$$
Below, we walk through how to find the new parameters $\widetilde{b}_i$ and $\widetilde{w}_{ij}$ so that
$$E'(\mathbf{s}) \;=\; -\sum_i \widetilde{b}_i\,s_i \;-\; \sum_{i<j} \widetilde{w}_{ij}\,s_i\,s_j$$
differs from the original energy $E(\mathbf{x}(\mathbf{s}))$ by only a constant (ensuring the same Boltzmann distribution).
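To make “keep exactly the same probability distribution” concrete, here is a small brute-force sketch in Python (the helper names and the symmetric, zero-diagonal matrix convention for the couplings are my own, not from the post). It computes the reference distribution of a $\{0,1\}$ Boltzmann machine by enumerating all states; note that an additive constant in the energy cancels in the normalization, which is exactly why it can be ignored below.

    import itertools
    import numpy as np

    def energy_01(x, b, W):
        # E(x) = -sum_i b_i x_i - sum_{i<j} W_ij x_i x_j, with the couplings stored
        # as a symmetric matrix W with zero diagonal (so 0.5 * x @ W @ x is the pair sum).
        x = np.asarray(x, dtype=float)
        return -b @ x - 0.5 * x @ W @ x

    def boltzmann(energies):
        # Normalize e^{-E} over an explicit array of energies; an additive shift cancels here.
        p = np.exp(-(energies - energies.min()))   # subtract the min for numerical stability
        return p / p.sum()

    # Reference distribution of a small random {0,1} Boltzmann machine.
    rng = np.random.default_rng(0)
    n = 4
    b = rng.normal(size=n)
    W = rng.normal(size=(n, n)); W = (W + W.T) / 2.0
    np.fill_diagonal(W, 0.0)

    states_01 = np.array(list(itertools.product([0.0, 1.0], repeat=n)))
    p_01 = boltzmann(np.array([energy_01(x, b, W) for x in states_01]))   # sums to 1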
1. Rewrite the old energy in terms of $\mathbf{s}$
Start with
$$E(\mathbf{x}) \;=\; -\,\sum_i b_i\,x_i \;-\; \sum_{i<j} w_{ij}\,x_i\,x_j.$$
Substitute $x_i = \tfrac{s_i + 1}{2}$. Then
$$E\bigl(\mathbf{x}(\mathbf{s})\bigr) \;=\; -\,\sum_i b_i\,\Bigl(\frac{s_i + 1}{2}\Bigr) \;-\; \sum_{i<j} w_{ij}\,\Bigl(\frac{s_i + 1}{2}\Bigr)\,\Bigl(\frac{s_j + 1}{2}\Bigr).$$
We can expand this step by step.
1.1 Expand the linear term
$$-\sum_i b_i\,\frac{s_i + 1}{2} \;=\; -\sum_i \Bigl(\frac{b_i}{2}\,s_i + \frac{b_i}{2}\Bigr) \;=\; -\sum_i \frac{b_i}{2}\,s_i \;-\; \sum_i \frac{b_i}{2}.$$
1.2 Expand the pairwise term
$$-\sum_{i<j} w_{ij}\,\Bigl(\frac{s_i + 1}{2}\Bigr)\,\Bigl(\frac{s_j + 1}{2}\Bigr) \;=\; -\sum_{i<j} \frac{w_{ij}}{4}\,\bigl(s_i + 1\bigr)\bigl(s_j + 1\bigr).$$
Now expand $(s_i + 1)(s_j + 1) = s_i s_j + s_i + s_j + 1$. Hence
$$-\sum_{i<j} w_{ij}\,\frac{s_i s_j + s_i + s_j + 1}{4} \;=\; -\sum_{i<j} \Bigl( \frac{w_{ij}}{4}\,s_i s_j \;+\; \frac{w_{ij}}{4}\,s_i \;+\; \frac{w_{ij}}{4}\,s_j \;+\; \frac{w_{ij}}{4} \Bigr).$$
Combine everything:
$$E\bigl(\mathbf{x}(\mathbf{s})\bigr) \;=\; -\sum_i \frac{b_i}{2}\,s_i \;-\; \sum_{i<j} \frac{w_{ij}}{4}\,s_i s_j \;-\; \sum_i \frac{b_i}{2} \;-\; \sum_{i<j} \frac{w_{ij}}{4}\,s_i \;-\; \sum_{i<j} \frac{w_{ij}}{4}\,s_j \;-\; \sum_{i<j} \frac{w_{ij}}{4}.$$
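As a quick sanity check of the algebra (my own check, not part of the original derivation), the two-unit case can be verified symbolically, assuming sympy is available:

    from sympy import symbols, Rational, expand, simplify

    s1, s2, b1, b2, w12 = symbols("s1 s2 b1 b2 w12")
    x1, x2 = (s1 + 1) / 2, (s2 + 1) / 2

    # Original {0,1} energy with x_i substituted by (s_i + 1)/2.
    E_old = -b1 * x1 - b2 * x2 - w12 * x1 * x2

    # The grouped expression above, specialized to two units.
    E_grouped = (-Rational(1, 2) * (b1 * s1 + b2 * s2)
                 - Rational(1, 4) * w12 * s1 * s2
                 - Rational(1, 2) * (b1 + b2)
                 - Rational(1, 4) * w12 * (s1 + s2)
                 - Rational(1, 4) * w12)

    assert simplify(expand(E_old - E_grouped)) == 0   # identical as polynomials in s1, s2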
2. Group terms by type
We want to match
$$E'(\mathbf{s}) \;=\; -\sum_i \widetilde{b}_i\,s_i \;-\; \sum_{i<j} \widetilde{w}_{ij}\,s_i s_j \;+\; \text{constant}.$$
So let us group together the $\sum_i s_i$ terms, the $\sum_{i<j} s_i s_j$ terms, and the constants.
2.1 Pairwise terms in $s_i s_j$
From above, the coefficient in front of $s_i s_j$ is $-\tfrac{w_{ij}}{4}$. Hence we identify
$$\widetilde{w}_{ij} \;=\; \frac{w_{ij}}{4}.$$
2.2 Linear terms in $s_i$
Look at all terms linear in $s_i$. There are two sources:
From $-\sum_i \tfrac{b_i}{2}\,s_i$, we get a contribution $-\tfrac{b_i}{2}\,s_i$.
From $-\sum_{i<j} \tfrac{w_{ij}}{4}\,(s_i + s_j)$, each $s_i$ gets contributions from all $j \neq i$:
If $j > i$, you see a term $-\tfrac{w_{ij}}{4}\,s_i$.
If $j < i$, that same weight $w_{ji} = w_{ij}$ appears in the term for the pair $(j, i)$; the factor written as $s_j$ in the generic expression is then your $s_i$, contributing another $-\tfrac{w_{ij}}{4}\,s_i$.
Putting it all together, for each $i$:
$$-\frac{b_i}{2}\,s_i \;-\; \sum_{j\neq i} \frac{w_{ij}}{4}\,s_i \;=\; -\biggl( \frac{b_i}{2} + \frac{1}{4}\sum_{j\neq i} w_{ij} \biggr)\,s_i.$$
Hence,
$$\widetilde{b}_i \;=\; \frac{b_i}{2} \;+\; \frac{1}{4}\,\sum_{j\neq i} w_{ij}.$$
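In code, this bias update is a one-liner if the couplings are stored as a symmetric numpy matrix with zero diagonal (a convention I am assuming; the post itself only states the formula):

    import numpy as np

    def to_ising_biases(b, W):
        # b_tilde_i = b_i / 2 + (1/4) * sum_{j != i} W_ij.
        # A zero diagonal makes the j != i restriction automatic.
        return b / 2.0 + W.sum(axis=1) / 4.0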
2.3 Constant terms
These do not affect the probabilities (they rescale every Boltzmann factor, and hence the partition function, by the same overall constant). Collecting them:
$$-\sum_i \frac{b_i}{2} \;-\; \sum_{i<j} \frac{w_{ij}}{4}.$$
So overall, in terms of a new energy $E'(\mathbf{s})$, we can drop that constant or just keep in mind that it is an additive shift.
3. Final transformation
Putting it all together:
Spin re-labeling: $s_i = 2x_i - 1$.
New couplings:
$$\widetilde{w}_{ij} \;=\; \frac{w_{ij}}{4}.$$
New biases:
$$\widetilde{b}_i \;=\; \frac{b_i}{2} \;+\; \frac{1}{4}\,\sum_{j \neq i} w_{ij}.$$
Additive constant (often omitted in the final expression) is
$$\text{constant} \;=\; -\,\sum_i \frac{b_i}{2} \;-\; \sum_{i<j} \frac{w_{ij}}{4}.$$
Hence the Ising-form energy in $\mathbf{s}\in\{-1,+1\}^n$ that reproduces the same Boltzmann distribution as the original $\{0,1\}$-valued Boltzmann machine is:
$$E'(\mathbf{s}) \;=\; -\sum_{i} \biggl( \frac{b_i}{2} + \tfrac{1}{4}\,\sum_{j \neq i} w_{ij} \biggr)\,s_i \;-\; \sum_{i<j} \frac{w_{ij}}{4}\,s_i\,s_j \;+\; \text{(constant)}.$$
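Here is an end-to-end numerical check of the whole transformation, again a sketch under the same assumed conventions as the first snippet (symmetric coupling matrix with zero diagonal, brute-force enumeration over all states). It confirms both that the energies differ by exactly the additive constant above and that the two normalized Boltzmann distributions coincide.

    import itertools
    import numpy as np

    rng = np.random.default_rng(0)
    n = 4
    b = rng.normal(size=n)
    W = rng.normal(size=(n, n))
    W = (W + W.T) / 2.0              # symmetric couplings
    np.fill_diagonal(W, 0.0)         # no self-couplings

    # Transformed parameters and the additive constant.
    W_t = W / 4.0
    b_t = b / 2.0 + W.sum(axis=1) / 4.0
    const = -b.sum() / 2.0 - W.sum() / 8.0   # W.sum() counts each pair twice, hence /8

    def E01(x):  return -b @ x - 0.5 * x @ W @ x
    def Epm1(s): return -b_t @ s - 0.5 * s @ W_t @ s

    xs = np.array(list(itertools.product([0.0, 1.0], repeat=n)))
    ss = 2 * xs - 1                  # the same states, relabeled as spins

    e0 = np.array([E01(x) for x in xs])
    e1 = np.array([Epm1(s) for s in ss])

    # 1) The energies differ by exactly the additive constant.
    assert np.allclose(e0, e1 + const)

    # 2) Hence the normalized Boltzmann distributions coincide.
    p0 = np.exp(-e0); p0 /= p0.sum()
    p1 = np.exp(-e1); p1 /= p1.sum()
    assert np.allclose(p0, p1)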
Summary of the Parameter Transformation
Spin: $s_i = 2x_i - 1$.
Couplings: $\widetilde{w}_{ij} = \tfrac{1}{4}\,w_{ij}$.
Biases: $\widetilde{b}_i = \tfrac{1}{2}\,b_i + \tfrac{1}{4}\sum_{j \neq i} w_{ij}$.
(Optional) Additive constant: $-\sum_i \tfrac{b_i}{2} - \sum_{i<j} \tfrac{w_{ij}}{4}$.
These formulas preserve the same Boltzmann distribution but rewrite it for spins taking values in $\{-1,+1\}$.